A new LDA formulation with covariates

Authors

Abstract

The Latent Dirichlet Allocation (LDA) model is a popular method for creating mixed-membership clusters. Despite having been originally developed for text analysis, LDA has been used for a wide range of other applications. We propose a new formulation of the LDA model which incorporates covariates. In this model, a negative binomial regression is embedded within LDA, enabling straightforward interpretation of the regression coefficients and analysis of the quantity of cluster-specific elements in each sampling unit (instead of being focused on modeling the proportion of each cluster, as in Structural Topic Models). We use slice sampling within a Gibbs algorithm to estimate the model parameters. We rely on simulations to show how our algorithm is able to successfully retrieve the true parameter values, and to show its ability to make predictions for the abundance matrix using the information given by the covariates. The model is illustrated using real data sets from three different areas: text-mining of Coronavirus articles, analysis of grocery shopping baskets, and the ecology of tree species on Barro Colorado Island (Panama). This model allows the identification of mixed-membership clusters in discrete data and provides inference on the relationship between covariates and the abundance of these clusters.
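To make the idea of a negative binomial regression embedded within LDA more concrete, the following is a minimal generative sketch in Python. It is not the authors' implementation: the function name `simulate_lda_covariates`, the log link, the shared dispersion parameter, and the Dirichlet prior on the cluster profiles are all assumptions chosen for illustration. It draws the number of elements per cluster in each sampling unit from a negative binomial whose mean depends on covariates, then spreads those elements over features (words, products, species) according to each cluster's profile.

```python
import numpy as np


def simulate_lda_covariates(n_units=50, n_clusters=3, n_features=20,
                            n_covariates=2, dispersion=5.0, seed=0):
    """Simulate an abundance matrix from a covariate-driven LDA-style model.

    All names and distributional choices here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Design matrix with an intercept column: shape (units, covariates + 1).
    X = np.column_stack([np.ones(n_units),
                         rng.normal(size=(n_units, n_covariates))])

    # One column of regression coefficients per cluster (assumed log link).
    beta = rng.normal(scale=0.5, size=(n_covariates + 1, n_clusters))
    beta[0] += 2.0  # shift intercepts so the expected counts are not tiny

    # Cluster profiles: each row is a probability vector over features.
    phi = rng.dirichlet(np.full(n_features, 0.5), size=n_clusters)

    # Expected number of cluster-k elements in unit l.
    mu = np.exp(X @ beta)

    # NumPy parameterizes the negative binomial by (n, p); choosing
    # p = r / (r + mu) gives mean mu for dispersion r.
    p = dispersion / (dispersion + mu)
    cluster_counts = rng.negative_binomial(dispersion, p)

    # Allocate each cluster's elements to features and sum over clusters.
    counts = np.zeros((n_units, n_features), dtype=np.int64)
    for l in range(n_units):
        for k in range(n_clusters):
            counts[l] += rng.multinomial(cluster_counts[l, k], phi[k])

    return X, beta, phi, cluster_counts, counts
```

In this sketch each coefficient in `beta` acts on the expected quantity of elements from a cluster rather than on a cluster proportion, which mirrors the contrast with Structural Topic Models drawn in the abstract. Fitting such a model would mean recovering `beta` and `phi` from `counts` and `X` alone, which the sketch does not attempt.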


Similar resources

A New Type-II Fuzzy Logic Based Controller for Non-Linear Dynamical Systems with Application to 3-PSP Parallel Robot

Type-II fuzzy logic has shown its superiority over traditional fuzzy logic when dealing with uncertainty. Type-II fuzzy logic controllers are, however, newer and more promising approaches that have recently been applied to various fields due to their significant contribution, especially when noise (as an important instance of uncertainty) emerges. During the design of type-I fuz...


Introducing a New Formulation for the Warehouse Inventory Management Systems: with Two Stochastic Demand Patterns

This paper presents a new formulation for warehouse inventory management in a stochastic situation. The formulation is derived primarily from the FP model, which was proposed by Fletcher and Ponnambalam for reservoir management. The newly proposed mathematical model is based on the first and second moments of storage as a stochastic variable. Using this model, the expected value...


A New Formulation for Cost-Sensitive Two Group Support Vector Machine with Multiple Error Rate

Support vector machine (SVM) is a popular classification technique which classifies data using a max-margin separating hyperplane. The normal vector and bias of this hyperplane are determined by solving a quadratic model, which implies that SVM training amounts to an optimization problem. Among the extensions of SVM, the cost-sensitive scheme refers to a model with multiple costs which conside...


Nutrigenomics: A New Approach to Food Regulation and Formulation

Nutrigenomics is the study of the effect of nutrition on gene expression. It discusses how DNA is converted to mRNA and mRNA is then converted to protein, and it is the basis for understanding the biological activity of edible compounds. Nutritional manipulations and nutritional approaches are key tools to influence the performance and health of organisms. Today, it has been shown that better nutri...


A New Weighted LDA Method in Comparison to Some Versions of LDA

Linear Discriminant Analysis (LDA) is a linear solution for the classification of two classes. In this paper, we propose a variant of the LDA method for multi-class problems which redefines the between-class and within-class scatter matrices by incorporating a weight function into each of them. The aim is to separate classes as much as possible in a situation where one class is well separated from other c...



Journal

Journal title: Communications in Statistics - Simulation and Computation

Year: 2023

ISSN: 0361-0918, 1532-4141

DOI: https://doi.org/10.1080/03610918.2023.2171059